Automatic Taxonomy Extraction from Query Logs with no Additional Sources of Information

Authors

  • Miguel Fernández-Fernández
  • Daniel Gayo-Avello
Abstract

Search engine logs store detailed information on Web users' interactions. Thus, as more and more people use search engines on a daily basis, important trails of users' common knowledge are being recorded in those files. Previous research has shown that it is possible to extract concept taxonomies from full-text documents, while other scholars have proposed methods to obtain similar queries from query logs. We propose a mixture of both lines of research, that is, mining query logs to find neither related queries nor query hierarchies, but actual term taxonomies that could be used to improve search engine effectiveness and efficiency. As a result, in this study we have developed a method that combines lexical heuristics with a supervised classification model to successfully extract hyponymy relations from specialization search patterns revealed from log missions, with no additional sources of information, and in a language-independent way.

Similar resources

Hierarchical Taxonomy Extraction by Mining Topical Query Sessions

Search engine logs store detailed information on Web users' interactions. Thus, as more and more people use search engines on a daily basis, important trails of users' common knowledge are being recorded in those files. Previous research has shown that it is possible to extract concept taxonomies from full-text documents, while other scholars have proposed methods to obtain similar queries from q...

What You Seek Is What You Get: Extraction of Class Attributes from Query Logs

Within the larger area of automatic acquisition of knowledge from the Web, we introduce a method for extracting relevant attributes, or quantifiable properties, for various classes of objects. The method extracts attributes such as capital city and President for the class Country, or cost, manufacturer and side effects for the class Drug, without relying on any expensive language resources or co...

A new query expansion method based on query logs mining

Query expansion has long been suggested as an effective way to improve the performance of information retrieval systems by adding relevant terms to the original queries. However, most previous research has been limited to extracting new terms from a subset of relevant documents and has not exploited the information about user interactions. In this paper, we propose a method for aut...

Weakly-Supervised Acquisition of Open-Domain Classes and Class Attributes from Web Documents and Query Logs

A new approach to large-scale information extraction exploits both Web documents and query logs to acquire thousands of open-domain classes of instances, along with relevant sets of open-domain class attributes at precision levels previously obtained only on small-scale, manually assembled classes.

Journal:
  • CoRR

Volume: abs/1510.00618

Publication date: 2015